Evaluation of mel-LPC cepstrum in a large vocabulary continuous speech recognition
Abstract
This paper presents a simple and efficient time-domain technique to estimate an all-pole model on the mel-frequency scale (Mel-LPC), and compares the recognition performance of the Mel-LPC cepstrum with that of both the standard LPC mel-cepstrum and the MFCC, using the Japanese dictation system Julius with a 20,000-word vocabulary. First, the optimal value of the frequency warping factor is examined in terms of monosyllable accuracy. With the optimal warping factors, the Mel-LPC cepstrum attains word accuracies of 93.0% for male speakers and 93.1% for female speakers, which are 2.1% and 1.7% higher than those of the LPC mel-cepstrum, respectively. Furthermore, this performance is slightly superior to that of the MFCC.
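The abstract does not spell out the warping itself. As a rough illustration, assuming the mel-frequency scale is approximated by the phase response of a first-order all-pass filter (a common choice in warped linear prediction, not confirmed here as the paper's exact formulation), the mapping from linear to warped frequency might be sketched as below. The function name `warped_frequency` and the value `alpha = 0.42` are illustrative assumptions; the paper tunes its warping factor experimentally, and its optimal values are not given in this abstract.

```python
import numpy as np


def warped_frequency(omega, alpha):
    """Map normalized frequency omega (rad/sample) to the warped axis.

    Uses the phase response of the first-order all-pass filter
    (z^-1 - alpha) / (1 - alpha * z^-1), with |alpha| < 1.
    """
    omega = np.asarray(omega, dtype=float)
    return omega + 2.0 * np.arctan2(alpha * np.sin(omega),
                                    1.0 - alpha * np.cos(omega))


# Hypothetical warping factor chosen only for illustration.
alpha = 0.42

# Nine evenly spaced frequencies from 0 to pi (the Nyquist frequency).
omega = np.linspace(0.0, np.pi, 9)
warped = warped_frequency(omega, alpha)

for w, w_tilde in zip(omega, warped):
    print(f"omega = {w:.3f} rad  ->  warped = {w_tilde:.3f} rad")
```

With a positive `alpha`, low frequencies are stretched and high frequencies compressed, while the endpoints 0 and pi stay fixed. This is the qualitative behaviour expected of a mel-like scale.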
Similar resources
An investigation of cepstral parameterisations for large vocabulary speech recognition
We examined variants of MFCC and PLP cepstral parameterisations in the context of large vocabulary continuous speech recognition under different acoustical environmental conditions. Compared to MFCC, mel-frequency PLP uses a cubic root intensity-to-loudness law, and an LPC analysis is applied to the mel-warped spectrum. In LPC-smoothed MFCC, the only difference to MFCC is the additional LPC smooth...
An efficient mel-LPC analysis method for speech recognition
This paper proposes a simple and efficient time-domain technique to estimate an all-pole model on a mel-frequency axis (Mel-LPC). This method requires only twice the computational cost of conventional linear prediction analysis. The recognition performance of mel-cepstral parameters obtained by the Mel-LPC analysis is compared with those of conventional LP mel-cepstra and the melfreque...
An adaptive MEL-LPC analysis for speech recognition
This paper describes a new speech analysis method, an adaptive Mel-LPC (AMLPC) analysis method, using human auditory characteristics. The Mel-LPC analysis method that we have proposed is an efficient time domain technique to estimate the warped predictors from input speech directly. However, the frequency resolution of spectrum obtained by Mel-LPC analysis is constant regardless of the characte...
A comparison of LPC and FFT-based acoustic features for noise robust ASR
Within the context of robust acoustic features for automatic speech recognition (ASR), we evaluated mel-frequency cepstral coefficients (MFCCs) derived from two spectral representation techniques, i.e. the fast Fourier transform (FFT) and linear predictive coding (LPC). ASR systems based on the two feature types were tested on a digit recognition task using continuous density hidden Markov ph...
Generalized mel frequency cepstral coefficients for large-vocabulary speaker-independent continuous-speech recognition
The focus of a continuous speech recognition process is to match an input signal with a set of words or sentences according to some optimality criteria. The first step of this process is parameterization, whose major task is data reduction by converting the input signal into parameters while preserving virtually all of the speech signal information dealing with the text message. This contributi...
Publication date: 2001